The Investment Performance of U.S. Equity Pension Fund Managers:

An  Empirical  Investigation 

ABSTRACT 

This  paper  presents  an  empirical  examination  of  the  selectivity  and  timing  performance  of 
a  sample  of  U.S.  equity  pension  fund  managers.  Regardless  of  the  choice  of  benchmark  portfolio 
or  estimation  model,  the  average  selectivity  measure  is  positive  and  the  average  timing  measure 
is  negative.  However,  selectivity  does  appear  to  be  somewhat  sensitive  to  the  choice  of  a 
benchmark  when  managers  are  classified  by  investment  style.  Meta-analysis  revealed  some  real 
variation  around  the  mean  values  for  each  measure.  The  80%  probability  intervals  for  selectivity 
revealed  that  the  best  managers  produced  substantial  risk-adjusted  excess  returns.  Consistent  with 
previous  studies  of  mutual  fund  performance,  we  also  found  a  negative  correlation  between 
selectivity  and  timing. 



Each  year  Pensions  &  Investments,  a  leading  trade  newspaper  for  the  pension 
management  industry,  profiles  the  top  1000  public  and  private  U.S.  pension  funds.  At  year-end 
1990, these funds had total pension assets of $1.876 trillion. Approximately $750 billion (40
percent)  was  invested  in  equities.  The  Investment  Company  Institute  estimates  that  $250  billion 
was  invested  in  open-  and  closed-end  equity-oriented  U.S.  mutual  funds  at  year-end  1990.  This 
snapshot  indicates  a  3:1  ratio  for  pension  equity  investment  versus  mutual  fund  equity 
investment.  Not  only  is  the  dollar  difference  large,  but  also  the  difference  in  the  number  of 
managers  in  each  universe  is  large.  The  total  number  of  pension  fund  managers  is  much  larger 
than  the  number  of  mutual  fund  managers,  by  a  ratio  of  approximately  10:1.  Yet  surprisingly 
little  research  has  been  done  on  the  investment  performance  of  U.S.  equity  pension  fund 
managers.  This  paper  begins  to  fill  an  important  gap  in  the  literature  by  providing  empirical 
evidence  on  the  investment  performance  of  these  managers. 

The  focus  of  this  study  is  on  equity  pension  fund  managers  who  have  been  allocated 
funds  by  a  pension  plan  sponsor.  Brinson,  Hood  and  Beebower  (1986),  Ippolito  and  Turner 
(1987), and Berkowitz, Finney and Logue (1988) examined the investment performance of a
sample  of  large  U.S.  pension  plans.  Each  plan  may  be  composed  of  many  fund  managers  in 
different  asset  categories  with  their  own  specific  investment  objectives  and  styles.  To  date,  ours 
is  the  only  study  we  know  of  which  specifically  examines  the  components  of  the  investment 
performance  of  a  sample  of  U.S.  equity  pension  fund  managers. 

The two components we examine are security selection skill and market timing skill.
Security  selection  involves  the  identification  of  individual  securities  which  are  under-  or 
overvalued  relative  to  the  market  in  general.  Within  the  specification  of  the  Capital  Asset 
Pricing  Model  (CAPM),  the  investment  manager  attempts  to  identify  securities  with  expected 
returns  which  lie  significantly  off  the  security  market  line.  The  manager  will  then  invest  in  those 
securities  which  offer  an  abnormally  high  risk  premium.  Market  timing  refers  to  forecasts  of 
return  on  the  market  portfolio.  If  the  manager  believes  he  can  forecast  the  market  return,  he  will 
adjust  his  portfolio  risk  level  accordingly. 

According  to  the  efficient  market  hypothesis,  all  active  investment  management  activity 
is  futile.  The  only  rational  investment  choice  for  a  plan  sponsor  is  to  invest  in  a  passively 
managed  market  index.  Hence,  in  an  efficient  market,  plan  sponsors  would  not  rationally  invest 
in  (or  pay  active  management  fees  for)  an  investment  program  which  cannot  outperform  a 
market index. However, there exists a very large active pension fund management business in
the  United  States.  Our  study  will  shed  some  light  on  the  question  of  whether  or  not  pension 
funds  are  behaving  rationally  to  perpetuate  this  business.  Our  paper  is  organized  as  follows. 
Section  I  presents  the  models  of  selectivity  and  market  timing  used  in  this  paper.  Section  II 
describes  the  data  and  methodology.  Section  III  presents  the  empirical  results.  Section  IV 
presents  a  meta-analysis  of  our  results.  Section  V  discusses  the  results.  Section  VI  concludes  our 
paper. 

I.  Models  of  Selectivity  and  Timing 
It is important that portfolio managers be evaluated on both selection ability and market
timing skill. Accordingly, it is necessary to model timing and selectivity simultaneously. Jensen
(1968,  1969)  formulated  a  return-generating  model  to  measure  performance  of  the  managed 
portfolios.   The  model  is: 

$$R_{pt} = \alpha_p + \beta_p R_{mt} + u_{pt} \qquad (1)$$

where $R_{pt}$ is the excess (net of risk-free rate) return on the pth portfolio, $R_{mt}$ is the excess (net
of risk-free rate) return on the market portfolio, $\beta_p$ measures the sensitivity of the portfolio to
the market return, $u_{pt}$ is a random error with an expected value of zero, and $\alpha_p$ is a measure
of security selection skill. This specification assumes that the risk level of the portfolio under
consideration is stationary through time and ignores the market timing skill of the managers.
Indeed, portfolio managers may shift the overall risk composition of their portfolio in
anticipation  of  broad  market  price  movements.  Fama  (1972)  and  Jensen  (1972)  addressed  this 
issue  and  suggested  a  somewhat  finer  breakdown  of  performance. 

Treynor  and  Mazuy  (1966)  added  a  quadratic  term  to  equation  (1)  to  test  for  market 
timing  ability.  They  argued  that  if  the  manager  can  forecast  market  returns,  he  will  hold  a 
greater  proportion  of  the  market  portfolio  when  the  return  on  the  market  is  high  and  a  smaller 
proportion  when  the  return  on  the  market  is  low.  Thus,  the  portfolio  return  will  be  a  non-linear 
function  of  the  market  return  as  follows: 

$$R_{pt} = \alpha_p + \beta_p R_{mt} + \gamma (R_{mt})^2 + e_{pt} \qquad (2)$$

A positive value of $\gamma$ would imply good market timing.
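
As an illustration only, the quadratic regression of equation (2) can be sketched in a few lines of Python. The data below are hypothetical and noise-free, and the helper names (`solve_linear_system`, `treynor_mazuy`) are our own labels for this sketch; the estimates reported later in the paper come from a heteroscedasticity-adjusted procedure, not from this plain least-squares fit.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def treynor_mazuy(portfolio_excess, market_excess):
    """Ordinary least squares fit of equation (2); returns (alpha, beta, gamma)."""
    rows = [[1.0, rm, rm * rm] for rm in market_excess]
    k = 3
    # Normal equations: (X'X) coef = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, portfolio_excess)) for i in range(k)]
    alpha, beta, gamma = solve_linear_system(xtx, xty)
    return alpha, beta, gamma


# Hypothetical monthly excess returns, built without noise so the fit is exact.
market = [-0.06, -0.03, -0.01, 0.00, 0.02, 0.04, 0.05, 0.08]
portfolio = [0.001 + 0.9 * rm - 0.5 * rm ** 2 for rm in market]

alpha, beta, gamma = treynor_mazuy(portfolio, market)
print(f"alpha={alpha:.4f}  beta={beta:.4f}  gamma={gamma:.4f}")
```

In this constructed example the manager has positive selectivity (alpha above zero) and poor timing (gamma below zero), the pattern the paper reports on average.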

Jensen  (1972)  developed  a  similar  model  to  detect  selectivity  and  timing  skill  of 
managers.  Jensen's  measure  of  market  timing  performance  calls  for  the  fund  manager  to 
forecast the deviation of the market portfolio return from its consensus expected return. By
assuming that the forecasted return and the actual return on the market have a joint normal
distribution, Jensen shows that a market timer's forecasting ability can
be measured by the correlation between the market timer's forecast and the realized return on
the market. He concluded that, under the above structure, the separate contributions of
selectivity and timing cannot be identified unless, for each period, the manager's forecast and
the consensus expected return on the market portfolio, $E(R_m)$, are known.

Bhattacharya  and  Pfleiderer  (1983)  extended  the  work  of  Jensen  (1972).  By  correcting 
an  error  made  in  Jensen  (1972),  they  show  that  one  can  use  a  simple  regression  technique  to 
obtain  measures  of  timing  and  selection  ability.  Jensen  assumes  that  the  manager  uses  the 
unadjusted  forecast  of  the  market  return  in  the  timing  decision.  Bhattacharya  and  Pfleiderer 
assume  that  the  manager  adjusts  forecasts  to  minimize  the  variance  of  the  forecast  error.  They 
specify a relationship in terms of observable variables, which is similar to the Treynor and
Mazuy model:

$$R_{pt} = \alpha_p + \theta E(R_m)(1 - \psi) R_{mt} + \psi\theta (R_{mt})^2 + \theta\psi\varepsilon_t R_{mt} + u_{pt} \qquad (3)$$

where

$\theta$ = the fund manager's response to information,

$\psi$ = the coefficient of determination between the manager's forecast and the excess
return on the market, and

$\varepsilon_t$ = the error of the manager's forecast.

This quadratic regression of $R_{pt}$ on $R_{mt}$ allows us to detect the existence of stock selection ability
as revealed by $\alpha_p$. The disturbance term in equation (3):

$$\omega_t = \theta\psi\varepsilon_t R_{mt} + u_{pt} \qquad (4)$$

contains the information needed to quantify the manager's timing ability. We can extract this
information by regressing $(\omega_t)^2$ on $(R_{mt})^2$:

$$(\omega_t)^2 = \theta^2\psi^2(\sigma_\varepsilon)^2 (R_{mt})^2 + \zeta_t \qquad (5)$$

where

$$\zeta_t = \theta^2\psi^2 (R_{mt})^2 [(\varepsilon_t)^2 - (\sigma_\varepsilon)^2] + (u_{pt})^2 + 2\theta\psi R_{mt}\varepsilon_t u_{pt} \qquad (6)$$

The proposed regression produces a consistent estimator of $\theta^2\psi^2(\sigma_\varepsilon)^2$, where $(\sigma_\varepsilon)^2$ is the variance
of the manager's forecast error. Using the consistent estimator of $\theta\psi$, which we recover from
equation (3), we obtain $(\sigma_\varepsilon)^2$. This, coupled with knowledge about $(\sigma_\pi)^2$, the variance of excess
return on the market, allows us to estimate $\psi = (\sigma_\pi)^2/[(\sigma_\pi)^2 + (\sigma_\varepsilon)^2] = \rho^2$, where $\rho$ is the
correlation between the manager's forecast and excess return on the market. Finally, we
calculate $\rho$, which truly measures the quality of the manager's timing information.
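
The two-step procedure just described can be sketched as follows. This is a simplified illustration under stated assumptions, not the estimation actually used in the paper: it uses ordinary least squares in both steps (the paper applies a GLS correction), it adds an intercept to the second-step regression to absorb the mean of the squared residual noise, and it proxies the market variance by the sample variance of the simulated market excess returns. All data are simulated and all function names are hypothetical.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(x_rows, y):
    """Least-squares coefficients from the normal equations."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)


def bp_timing(portfolio_excess, market_excess):
    """Return (alpha, coefficient on squared market return, rho)."""
    # Step 1: quadratic regression, equation (3).
    x = [[1.0, rm, rm * rm] for rm in market_excess]
    alpha, b1, theta_psi = ols(x, portfolio_excess)
    resid = [y - (alpha + b1 * rm + theta_psi * rm * rm)
             for y, rm in zip(portfolio_excess, market_excess)]
    # Step 2: squared residuals on squared market return, equation (5).
    # The intercept is an assumption of this sketch, not part of equation (5).
    x2 = [[1.0, rm * rm] for rm in market_excess]
    _, slope = ols(x2, [w * w for w in resid])
    if slope <= 0.0 or theta_psi == 0.0:
        raise ValueError("forecast-error variance is not identified in this sample")
    var_forecast_error = slope / theta_psi ** 2
    n = len(market_excess)
    mean_rm = sum(market_excess) / n
    var_market = sum((rm - mean_rm) ** 2 for rm in market_excess) / (n - 1)
    psi = var_market / (var_market + var_forecast_error)
    return alpha, theta_psi, psi ** 0.5


# Simulated data (hypothetical parameter values chosen for illustration).
rng = random.Random(0)
theta, mean_mkt, sd_mkt, sd_eps = 5.0, 0.01, 0.05, 0.10
psi_true = sd_mkt ** 2 / (sd_mkt ** 2 + sd_eps ** 2)  # 0.2, so true rho is about 0.45
market, portfolio = [], []
for _ in range(20000):
    rm = rng.gauss(mean_mkt, sd_mkt)
    eps = rng.gauss(0.0, sd_eps)
    u = rng.gauss(0.0, 0.01)
    rp = (0.002 + theta * mean_mkt * (1 - psi_true) * rm
          + psi_true * theta * rm * rm + theta * psi_true * eps * rm + u)
    market.append(rm)
    portfolio.append(rp)

alpha, theta_psi, rho = bp_timing(portfolio, market)
print(f"alpha={alpha:.4f}  theta*psi={theta_psi:.3f}  rho={rho:.3f}")
```

A long simulated sample is used here only so that the second-step slope is reliably positive; with 96 monthly observations the estimate is much noisier.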

The  Bhattacharya  and  Pfleiderer  model  of  equation  (3)  is  a  refinement  of  the  Treynor  and 
Mazuy  model.  It  focuses  on  the  coefficient  of  the  squared  excess  market  return  as  an  indication 
of  timing  skill.  It  is  the  first  model  that  analyzes  the  error  term  to  identify  a  manager's 
forecasting  skill.  Such  a  refinement  should  make  the  model  more  powerful  than  previous  ones. 
Further  detail  and  econometric  issues  relating  to  the  Bhattacharya  and  Pfleiderer  model  are 
discussed  in  Lee  and  Rahman  (1990).  In  the  empirical  tests  reported  in  Section  III,  we  employed 
both  the  Treynor  and  Mazuy  and  the  Bhattacharya  and  Pfleiderer  models.  This  will  allow  us 
to  examine  the  sensitivity  of  results  to  alternative  model  specifications. 

There  are  other  models  in  the  literature  that  permit  identification  and  separation  of 
selectivity and timing skills of portfolio managers. These are models by Grinblatt and Titman
(1989b), Henriksson and Merton (1981), and an alternative version of the Henriksson and
Merton model by Kon and Jen (1978, 1979). The Grinblatt and Titman model requires the
observation of the historical sequence of portfolio weights for the manager. Unfortunately, data
on portfolio weights are costly and time-consuming to collect and are often not available. The Henriksson
and Merton model provides no significant advantage over the Bhattacharya and Pfleiderer model.
One weakness of the Henriksson and Merton model is that information is measured but there is
no test of whether the information is being used correctly. The forecasters in this model are less
sophisticated than those of the Bhattacharya and Pfleiderer model, where they do forecast how
much better the superior investment will perform. Henriksson and Merton assume that managers
have  a  coarse  information  structure  in  which  dichotomous  signals  are  only  predictive  of  the  sign 
of  the  excess  return  of  the  market  relative  to  the  risk-free  rate.  In  their  model,  the  probability 
of  receiving  an  "up"  or  a  "down"  signal  in  no  way  depends  upon  how  far  the  market  will  be 
"up"  or  "down." 


II.  Data  and  Methodology 
The  data  for  this  study  consist  of  monthly  returns  for  the  period  January  1983  through 
December  1990  (96  months)  for  a  sample  of  71  U.S.  equity  pension  fund  managers.  Returns 
are  net  of  expenses  and  management  fees.  These  managers  invest  exclusively  in  the  U.S.  equity 
market.  The  data  were  provided  by  the  Frank  Russell  Company  of  Tacoma,  Washington. 
Among  other  services,  the  Frank  Russell  Company  evaluates  the  performance  of  the  managers 
of  a  number  of  pension  funds  throughout  the  United  States.  The  Frank  Russell  Company 
segregates  equity  managers  into  four  basic  investment  styles  on  the  basis  of  managers'  portfolio 
characteristics. These are: (1) Earnings Growth, (2) Market-Oriented, (3) Price-Driven, and
(4) Small Capitalization. Our sample consists of 18 Earnings Growth, 19 Market-Oriented, 18
Price-Driven, and 16 Small Capitalization managers. Appendix I.A describes these four
investment styles. Monthly observations of the Treasury bill rate were used as a proxy for the
risk-free rate.

Our  study  uses  several  alternative  equity  benchmark  portfolios.  Two  of  these  are  the 
S&P  500  Index  and  the  Russell  3000  Index.  The  Russell  3000  Index  is  a  broad  market  index 
like  the  S&P  500.  Apf)endix  I.B  describes  the  Russell  3000  Index  and  compares  it  to  the  S&P 
500  Index.  In  addition  to  these  two  broad  market  indices,  we  also  use  four  style  indices  as 
benchmarks.  To  be  more  specific,  we  use  separate  benchmarks  for  four  different  investment 
styles.  These  style  indices  are  the  Russell  1000  Index  (for  Market-Oriented  managers),  the 
Russell  2000  index  (for  Small  Cap  managers),  the  Russell  Price-Driven  Index  (for  Price-Driven 
managers),  and  the  Russell  Earnings  Growth  Index  (for  Earnings  Growth  managers).  Appendix 
I.B  describes  these  indices  and  compares  them  to  broad  market  indices.  The  use  of  several 
alternative indices will allow us to examine the sensitivity of pension fund managers'
performance to alternative benchmarks. An estimate of the variance of the excess return on the
market, $(\sigma_\pi)^2$, was derived from observed returns for each benchmark following the procedure
of  Lee  and  Rahman  (1990). 

In  the  empirical  test,  it  is  necessary  to  correct  for  heteroscedasticity  in  both  the  Treynor 
and  Mazuy  model  and  the  Bhattacharya  and  Pfleiderer  model.  In  the  Treynor  and  Mazuy 
model,  the  error  term  will  exhibit  conditional  heteroscedasticity  because  of  the  fund  manager's 
attempt  to  time  the  market,  even  though  security  returns  are  assumed  to  be  independent  and 
identically distributed through time. To correct this, following Breen, Jagannathan and Ofer
(1986) and Lehmann and Modest (1987), we use heteroscedasticity-consistent standard errors
proposed by White (1980), Hansen (1982), and Hsieh (1983). The significance tests reported
in  Section  III  are  based  on  heteroscedasticity-adjusted  t-statistics. 
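
For readers unfamiliar with the correction, the sketch below computes White-type heteroscedasticity-consistent standard errors for a least-squares regression. It implements the basic form of the estimator (often labelled HC0), with no small-sample adjustment, and is offered as an assumption about how such t-statistics could be formed rather than as a record of the exact computation behind our tables. The data and helper names are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def white_standard_errors(x_rows, y):
    """Return (coefficients, HC0 standard errors) for an OLS fit of y on x_rows."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(x_rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [yt - sum(c * v for c, v in zip(coef, r)) for r, yt in zip(x_rows, y)]
    # Inverse of X'X, built column by column from unit vectors.
    inv_cols = [solve(xtx, [1.0 if i == j else 0.0 for i in range(k)]) for j in range(k)]
    bread = [[inv_cols[j][i] for j in range(k)] for i in range(k)]
    # Middle term: sum over observations of e_t^2 * x_t x_t'.
    meat = [[sum(e * e * r[i] * r[j] for r, e in zip(x_rows, resid))
             for j in range(k)] for i in range(k)]
    cov = matmul(matmul(bread, meat), bread)
    se = [math.sqrt(max(cov[i][i], 0.0)) for i in range(k)]
    return coef, se


# Hypothetical data: shocks are larger when the market move is larger.
market = [-0.06, -0.03, -0.01, 0.00, 0.02, 0.04, 0.05, 0.08]
shocks = [0.006, -0.003, 0.001, 0.0005, -0.002, 0.004, -0.005, 0.008]
portfolio = [0.001 + 0.9 * rm - 0.5 * rm ** 2 + s for rm, s in zip(market, shocks)]

rows = [[1.0, rm, rm * rm] for rm in market]
coef, se = white_standard_errors(rows, portfolio)
for name, c, s in zip(("alpha", "beta", "gamma"), coef, se):
    print(f"{name}: estimate={c:.4f}  robust t={c / s:.2f}")
```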

In  the  Bhattacharya  and  Pfleiderer  model,  the  procedure  discussed  in  Section  I  does  not 
produce the most efficient estimates of the parameters since the disturbance terms in equations
(3) and (5) are heteroscedastic. More efficient estimates can be obtained by taking into account
the heteroscedasticity of the disturbance terms. We followed a Generalized Least Squares (GLS)
procedure, which makes a correction for heteroscedasticity, to obtain efficient estimates of
parameters.    This  methodology  is  more  fully  described  in  Lee  and  Rahman  (1990). 

As  noted  in  Coggin  and  Hunter  (1991),  one  weakness  of  the  Treynor  and  Mazuy  and  the 
Bhattacharya and Pfleiderer models is that they ignore negative or inferior market timing. We
modify  these  models  to  allow  negative  timing  skill.  We  hypothesize  that  managers  may  exhibit 
negative  ex  post  timing  skill.  In  the  Treynor  and  Mazuy  model,  this  means  the  manager  holds 
a  smaller  portion  of  the  market  portfolio  when  the  market  return  is  high.  In  the  Bhattacharya 
and  Pfleiderer  model,  this  is  indicative  of  a  negative  correlation  between  beta  and  the  market 
return.  Such  results  in  both  models  could  be  due  to  the  inability  of  managers  to  correctly 
forecast  the  expected  return  on  the  market  portfolio.  Hence  these  managers  would  forecast  the 
market  return  to  be  high  when  it  is  actually  low  and  vice  versa.  In  the  Treynor  and  Mazuy 
model of equation (2), a negative value of $\gamma$ would be indicative of poor market timing.

For the Bhattacharya and Pfleiderer model, we examine the sign of the coefficient of
$(R_{mt})^2$ in equation (3). Intuitively, in the spirit of the Treynor and Mazuy model, the sign of
this coefficient will be indicative of the nature of timing skill. If the estimated value of this
coefficient is negative, we designate timing skill given by $\rho$ to be poor or inferior. This
modification  makes  these  models  more  realistic.  A  similar  adjustment  of  the  Bhattacharya  and 
Pfleiderer  model  was  implicitly  introduced  in  Jagannathan  and  Korajczyk  (1986,  p.  229). 

III.  Empirical  Results 

A.  Significant  Selectivity  and  Timing  Skill 

Table  I  presents  summary  results  from  the  two  models.  In  this  table  and  those  that 
follow,  "S&P  500"  denotes  results  based  on  using  the  S&P  500  as  the  benchmark  portfolio, 
"Russell  3000"  denotes  results  based  on  using  the  Russell  3000  as  the  benchmark  portfolio,  and 
"Style  Index"  denotes  results  based  on  using  each  manager's  appropriate  style  index  as  the 
benchmark  portfolio.  These  results  show  some  evidence  of  positive  security  selection  skill  and 
negative  timing  skill  on  the  part  of  managers.  The  number  of  significant  positive  selectivity 
values  exceeds  the  number  of  significant  negative  selectivity  values  for  both  models  regardless 
of  the  benchmark  used.  For  timing  skill,  the  results  are  just  the  opposite.  For  both  models,  the 
number  of  significant  negative  timing  values  exceeds  the  number  of  significant  positive  timing 
values  regardless  of  the  benchmark  used. 

—  Insert  Table  I  about  here  — 

B.  Mean  Values  of  Performance  Measures 

Table  II  presents  the  means  of  the  selectivity  and  timing  values  for  all  managers  and  for 
the subsets of managers classified by investment style. For the entire sample (All Managers),
both models show a positive mean selectivity value for all three alternative benchmarks. These
values are significant at the .05 level for two of the three benchmarks. For timing skill, the
results are just the opposite. For the entire sample, both models show a negative mean timing
value for all three alternative benchmarks. However, for only one of the three benchmarks (the
S&P 500), the mean timing value is significant at the .05 level for both models. Hence the
results using the S&P 500 Index as a benchmark contrast with the results obtained using the
Russell 3000 Index and the style indices as benchmarks. As shown in Appendix I.B, the latter
two  indices  are  much  more  representative  of  the  managers'  investment  universe  (i.e.,  true 
investment  opportunities)  than  the  former  and,  as  such,  are  more  appropriate  benchmarks  than 
the  former. 

The  results  in  Tables  I  and  II  suggest  that  pension  fund  managers  are  on  average  better  stock 
pickers  than  market  timers.  The  results  that  were  only  hinted  at  in  Table  I  are  now  strongly 
confirmed  in  Table  II.  Our  results  relating  to  selection  skill  are  consistent  with  those  of  Lee  and 
Rahman  (1990),  who  found  some  evidence  of  superior  selection  skill  on  the  part  of  mutual  fund 
managers.  They  also  found  evidence  of  superior  market  timing  skill  for  several  managers. 
However,  it  should  be  pointed  out  that  Lee  and  Rahman  (1990)  ignored  negative  market  timing 
skill  in  their  model,  while  we  allow  negative  market  timing  here.  Our  market  timing  results  are 
consistent  with  those  of  previous  studies  on  mutual  fund  performance  (see  Kon  (1983),  Chang 
and  Lewellen  (1984),  Henriksson  (1984),  Lehmann  and  Modest  (1988),  Cumby  and  Glen  (1990), 
Coggin  and  Hunter  (1991),  and  Connor  and  Korajczyk  (1991)).  These  studies  found  more 
evidence  of  negative  market  timing  than  positive.  These  studies  also  found  some  evidence  of 
negative  selection  skill  for  mutual  funds. 

There are differences in the portfolio characteristics and investment styles among the
Earnings  Growth,  Market-Oriented,  Price-Driven,  and  Small  Capitalization  managers.  It  is 
therefore  useful  to  examine  performance  measures  for  each  investment  style  separately.  Table 
II  presents  mean  values  of  the  performance  measures  for  each  style  of  manager.  It  also  provides 
the  aggregated  rank  of  each  group.  These  ranks  do  not  vary  between  the  models  for  a  given 
benchmark.    However,  they  do  vary  somewhat  across  benchmarks  for  a  given  model. 

The  period  1983-1990  was  a  period  in  which  the  overall  stock  market  was  up 
substantially.  For  the  eight  years,  the  Russell  3000  grew  at  an  annualized  rate  of  14.17%,  and 
the  S&P  500  grew  at  a  15.60%  rate.  For  the  majority  of  this  period  (up  until  the  end  of  1988) 
the  "value"  investment  style  was  favored  by  the  market  relative  to  other  investment  styles.  Our 
analog  of  this  style  is  the  Price-Driven  index  which  grew  at  an  annualized  rate  of  15.53%.  This 
compares  to  the  "growth"  investment  style  (represented  by  the  Earnings  Growth  index)  which 
grew at a 13.72% rate, and the Small Capitalization style (represented by the Russell 2000
index)  which  grew  at  a  7.38%  rate.  In  Table  II  we  see  that,  using  the  broad  stock  market  indices 
as  benchmarks,  a  negative  mean  selectivity  value  is  consistently  observed  for  the  growth  and 
small  capitalization  managers.  This  is  consistent  with  the  preference  of  the  stock  market  for  the 
period. However, if we look at the Style Index as a benchmark, we see that these managers (as
well as all other styles) have positive selectivity values. Thus, while we observe a positive mean
selectivity  value  across  All  Managers  for  each  benchmark,  it  does  appear  to  make  a  difference 
which  benchmark  portfolio  is  used  (and,  perhaps,  which  time  period)  when  we  move  to  the  level 
of  investment  style. 

— -  Insert  Table  II  about  here  — - 

C. Correlation of Performance Measures

To  examine  the  sensitivity  of  results  to  benchmarks  and  models,  we  also  examine  the 
correlation  of  the  performance  measures  across  models  and  benchmarks.  We  use  two  measures 
of  association  -  the  Pearson  correlation  coefficient  and  the  Spearman  rank  correlation  coefficient. 
Tables  III,  IV,  and  V  provide  correlational  summaries  of  the  results  presented  in  Tables  I  and 
II.  Table  III  represents  the  correlation  of  a  performance  measure  (selectivity  or  timing)  with 
itself  between  benchmarks  for  a  given  model.  All  the  correlations  reported  in  the  table  are 
significant  at  the  .0001  level.  There  is  a  very  high  correlation  between  the  results  based  on  the 
broad  market  indices  -  the  S&P  500  Index  and  the  Russell  3000  Index.  The  performance 
measures  based  on  these  benchmarks  are  somewhat  less  correlated  with  those  based  on  style 
indices.  These  results  are  consistent  for  both  models  and  also  for  both  the  timing  and  selectivity 
measures.  Table  IV  presents  the  correlation  of  a  performance  measure  (selectivity  or  timing) 
with  itself  between  models  for  a  given  benchmark.  These  correlations  are  very  high  and 
significant at the .0001 level. The results in Tables III and IV indicate high ranking consistency
among  benchmarks  and  between  models. 
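
The two measures of association used in this subsection can be computed as in the short sketch below. The selectivity values are hypothetical and serve only to show that the rank correlation can equal one even when the Pearson correlation does not; the function names are our own.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical monthly selectivity estimates for five managers under two benchmarks.
selectivity_sp500 = [0.0012, -0.0004, 0.0031, 0.0007, -0.0015]
selectivity_russell = [0.0015, 0.0001, 0.0028, 0.0010, -0.0009]

print("Pearson: ", round(pearson(selectivity_sp500, selectivity_russell), 4))
print("Spearman:", round(spearman(selectivity_sp500, selectivity_russell), 4))
```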

Finally,  we  present  the  correlation  between  selectivity  and  timing  skill  within  a  model 
for  a  given  benchmark.  These  correlations  are  presented  in  Table  V.  All  these  correlations  are 
significantly  negative.  These  results  indicate  that  good  (poor)  selectivity  is  associated  with  poor 
(good)  timing  ability  regardless  of  the  benchmark  or  model  used.  This  implies  that  fund 
managers  can  not  accomplish  both  selectivity  and  timing  simultaneously.  We  will  have  more 
to  say  about  this  in  Section  V.B. 

— -  Insert  Tables  III,  IV,  V  about  here  — - 

IV. Meta-Analysis of Results

Meta-analysis  is  a  statistical  methodology  for  the  cumulation  of  results  across  studies.  The 
contribution  of  meta-analysis  is  to  offer  a  statistical  technique  to  produce  direct  estimates  of  the 
mean  and  standard  deviation  of  population  values.  Thus,  meta-analysis  allows  more  statistically 
powerful  inferences  from  data  than  are  possible  using  more  traditional  disaggregated  analyses. 
Following its early beginnings in physics and psychology, meta-analysis has recently been applied
to  cumulate  results  across  studies  in  several  other  disciplines  including  accounting  (Christie 
(1990)  and  Trotman  and  Wood  (1991)),  finance  (Coggin  and  Hunter  (1983,  1987,  1991)  and 
Dimson  and  Marsh  (1984)),  and  marketing  (Farley  and  Lehman  (1986)).  Recent  comprehensive 
texts  on  meta-analysis  include  Hedges  and  Olkin  (1985)  and  Hunter  and  Schmidt  (1990). 

There  are  a  number  of  "study  artifacts"  which  can  cause  the  results  of  one  study  to  appear 
different  or  even  contradictory  to  those  of  another.  Among  the  more  prominent  artifacts  are 
sampling  error,  error  of  measurement,  and  restriction  of  range  on  the  dependent  variable.  These 
artifacts  are  discussed  in  detail  in  Hunter  and  Schmidt  (1990,  Chapters  2  and  3).  In  this  paper, 
we  focus  on  sampling  error  in  the  regression  values  for  selectivity  and  market  timing  across 
managers.  Meta-analysis  has  been  primarily  developed  for  correlational  data.  However,  the  time 
series  regressions  performed  in  our  paper  have  identical  specifications  (by  performance 
measurement  model)  across  the  sample  of  pension  fund  managers.  Thus,  for  the  purpose  of 
meta-analysis,  we  can  consider  each  of  the  71  managers  as  a  "study,"  cumulate  the  results  and 
apply  meta-analysis.  Appendix  II  to  this  paper  presents  a  brief  discussion  of  the  meta-analysis 
technique  for  regression  coefficients  used  in  this  section. 

Table VI presents the results of the meta-analysis of the selectivity and timing coefficients
based on three benchmark portfolios and using heteroscedasticity-corrected t-values. The first
row of this table gives the frequency-weighted mean of the observed values for each parameter,
$\bar{b}$; the second row gives estimates of the standard deviation of the observed values, $S_b$; the third
row gives estimates of the standard deviation of the population values, $s_\beta$; the fourth row gives
estimates of the frequency-weighted average squared deviation of the observed values, $S_b^2$; the
fifth row gives estimates of the variance of the population values, $s_\beta^2$; the sixth row gives
estimates of the sampling error variance, $s_e^2$; the seventh row gives the chi-square value for the
ratio of the observed variance to the sampling error variance; and the last row gives estimates
of the proportion of total observed variance accounted for by sampling error, $s_e^2/S_b^2$.
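
The quantities listed above can be illustrated with a small sketch. The formulas here are assumptions in the spirit of a "bare bones" meta-analysis: the sampling error variance is taken as the weighted average of the squared standard errors, the population variance as the observed variance minus the sampling error variance (floored at zero), and the chi-square value as the number of managers times the ratio of observed to sampling error variance. The exact expressions used for Table VI are those of Appendix II and may differ in detail. The input values and the function name are hypothetical.

```python
import math


def meta_analysis(estimates, std_errors, weights=None):
    """Decompose the variance of coefficient estimates across managers."""
    k = len(estimates)
    if weights is None:
        weights = [1.0] * k  # equal weights, e.g. the same number of months for each manager
    total = sum(weights)
    mean_b = sum(w * b for w, b in zip(weights, estimates)) / total
    observed_var = sum(w * (b - mean_b) ** 2 for w, b in zip(weights, estimates)) / total
    sampling_var = sum(w * se ** 2 for w, se in zip(weights, std_errors)) / total
    population_var = max(observed_var - sampling_var, 0.0)
    return {
        "mean": mean_b,
        "observed_sd": math.sqrt(observed_var),
        "population_sd": math.sqrt(population_var),
        "observed_var": observed_var,
        "population_var": population_var,
        "sampling_var": sampling_var,
        # Assumed form of the test statistic; compare with chi-square on k - 1 df.
        "chi_square": k * observed_var / sampling_var,
        # Not capped at 1: a value above 1 means sampling error alone exceeds
        # the observed spread.
        "share_sampling_error": sampling_var / observed_var,
    }


# Hypothetical monthly selectivity estimates and standard errors for five managers.
alphas = [0.0021, -0.0008, 0.0035, 0.0004, 0.0012]
alpha_se = [0.0011, 0.0009, 0.0014, 0.0010, 0.0012]

for name, value in meta_analysis(alphas, alpha_se).items():
    print(f"{name}: {value:.6g}")
```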

—  Insert  Table  VI  about  here  — 

A.     Selectivity 

For  selectivity,  the  mean  monthly  values  are  positive  in  every  case  but  very  small. 
However,  on  an  annualized  basis,  these  numbers  become  more  meaningful.  For  the  Bhattacharya 
and Pfleiderer model, the annualized mean selectivity values are .41% (S&P 500), .93% (Russell
3000),  and  1.97%  (Style  Index).  For  the  Treynor  and  Mazuy  model,  the  annualized  mean 
selectivity  values  are  .51%  (S&P  500),  .96%  (Russell  3000),  and  1.99%  (Style  Index).  Hence 
we  see  that  for  both  models,  managers  do  better  on  average  relative  to  their  own  style  index  as 
compared  to  the  broader  market  indices.  This  result  is  instructive,  since  much  of  the  common 
investment wisdom implies that investment managers "can't beat the market." This result
suggests that such a comment begs an important question regarding which benchmark should be
used  in  evaluating  a  manager.  We  remind  the  reader  that  these  returns  are  net  of  investment 
management  fees. 

The  chi-square  values  are  significant  at  the  .05  level  or  less  for  the  selectivity  values  using 
all  three  benchmarks  for  both  models.  This  implies  that  there  is  real  variation  (in  excess  of  that 
attributable  to  sampling  error)  around  the  mean  selectivity  value  in  each  case. 

B.  Timing 

For  market  timing,  the  mean  values  are  negative  in  each  case.  This  result  is  consistent  with 
the  results  of  Kon  (1983),  Chang  and  Lewellen  (1984),  Henriksson  (1984),  Grinblatt  and  Titman 
(1988),  Lehmann  and  Modest  (1988),  Cumby  and  Glen  (1990),  Coggin  and  Hunter  (1991),  and 
Connor  and  Korajczyk  (1991)  who  examined  mutual  fund  returns.  Furthermore,  the  chi-square 
values  are  significant  at  the  .10  level  or  less  in  each  case  except  for  the  Bhattacharya  and 
Pfleiderer  model  using  the  S&P  500  benchmark.  Thus  in  almost  every  case  there  is  evidence 
of  real  variation  around  the  negative  mean  timing  value. 

C.  The  80%  Probability  Intervals  for  Selectivity  and  Timing 

If  there  were  no  real  variation  around  the  observed  mean  value,  then  the  observed  mean 
would  be  the  true  value  for  each  of  the  71  managers.  However,  in  our  case,  there  is  evidence 
of  real  variation  in  almost  every  set  of  selectivity  and  market  timing  values.  To  put  these  results 
in  perspective,  we  can  look  at  the  last  row  of  Table  VI  for  each  model  and  examine  the 
proportion of total observed variance accounted for by sampling error. For the Bhattacharya and
Pfleiderer model, the percentage of observed variance in selectivity accounted for by sampling
error goes from 71% to 57% to 50% across benchmarks; while the percentage of variance in
timing accounted for by sampling error goes from 95% to 81% to 68% across benchmarks. For
the  Treynor  and  Mazuy  model,  the  percentages  for  selectivity  go  from  71%  to  57%  to  50% 
across  benchmarks;  while  the  timing  percentages  go  from  18%  to  17%  to  14%  across 
benchmarks.  We  should  note  that,  as  discussed  in  Hunter  and  Schmidt  (1990),  these  percentages 
of  variance  attributable  to  sampling  error  may  well  contain  other  unaccounted  for  study  artifacts 
(such  as  measurement  error). 

Assuming  selectivity  and  market  timing  to  be  normally  distributed,  we  can  also  examine  the 
80% probability intervals (i.e., the lower and upper 90% probability values) for the spread of
the  observed  and  population  values  presented  in  Table  VII.  The  probability  intervals  in  Table 
VII  clearly  show  the  amount  of  variation  in  both  the  observed  and  the  population  values  for 
selectivity  and  market  timing. 
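
As a rough illustration of how such an interval can be obtained, the sketch below computes a central 80% interval as the mean plus or minus about 1.28 standard deviations, which is the normal-theory result. The function name and the input values are hypothetical and are not taken from Table VII. For the population interval one would presumably use the square root of the corrected variance as the standard deviation.

```python
from statistics import NormalDist


def probability_interval(mean, sd, coverage=0.80):
    """Return (lower, upper) bounds of a central interval under normality."""
    tail = (1.0 - coverage) / 2.0
    # Standard normal quantile; about 1.2816 when coverage is 0.80
    z = NormalDist().inv_cdf(1.0 - tail)
    return mean - z * sd, mean + z * sd


# Hypothetical mean and standard deviation, for illustration only
lower, upper = probability_interval(mean=0.10, sd=0.25)
print(round(lower, 4), round(upper, 4))
```

Under this reading, the lower bound marks the value exceeded by 90% of managers and the upper bound marks the value exceeded by only 10%.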

As  noted  above,  there  is  real  variation  in  selectivity  and  timing  values  in  every  case  except 
one  (i.e.,  timing  values  from  the  Bhattacharya  and  Pfleiderer  model  using  the  S&P  500 
benchmark).  The  80%  probability  intervals  for  selectivity  are  all  shifted  towards  positive  values, 
while  the  80%  probability  intervals  for  timing  are  all  shifted  towards  negative  values.  This 
result  is  confirmed  by  the  significance  counts  for  positive  and  negative  selectivity  and  timing 
values  in  Table  I. 

Using  the  80%  probability  intervals  for  the  population  selectivity  values  in  Table  VII,  we 
can  look  at  the  true  spread  in  pension  manager  excess  returns  for  the  two  models  across 
benchmarks. The return for the top 10% of managers is obtained by annualizing the appropriate
upper bound return in Table VII, and the return for the bottom 10% of managers is obtained by
annualizing the appropriate lower bound return in Table VII. For the Bhattacharya and
Pfleiderer model using the S&P 500 benchmark, the true annualized spread in returns is 4.52%
(top 10% = 2.69%, bottom 10% = -1.83%); using the Russell 3000, the true spread is 5.49%
(top 10% = 3.71%, bottom 10% = -1.78%); and using the style index, the true spread is 5.44%
(top 10% = 4.72%, bottom 10% = -.72%). For the Treynor and Mazuy model, the true
annualized  spread  in  returns  using  the  S&P  500  benchmark  is  5.01%  (top  10%  =  3.04%, 
bottom  10%  =  -1.97%);  using  the  Russell  3000,  the  true  spread  is  5.55%  (top  10%  =  3.77%, 
bottom  10%  =  -1.78%);  and  using  the  style  index,  the  true  spread  is  5.86%  (top  10%  = 
4.96%,  bottom  10%  =  -.90%).  Hence  there  is  evidence  in  our  data  that  the  best  pension  fund 
managers  can  deliver  substantial  risk-adjusted  excess  returns,  no  matter  which  model  or 
benchmark  we  use.  This  complements  the  results  of  Grinblatt  and  Titman  (1989a),  Ippolito 
(1989),  Lee  and  Rahman  (1990),  and  Coggin  and  Hunter  (1991)  who  found  evidence  of  superior 
performance  in  their  studies  of  mutual  funds. 

—  Insert  Table  VII  about  here  — 

D.     The  Correlation  between  Selectivity  and  Timing 

Looking at the last line of each panel in Table VI ($s_e^2/s_b^2$), we see that in each case for both
models  the  style  index  benchmark  results  in  the  least  amount  of  sampling  error  in  the  variation 
of  the  selectivity  and  timing  values.  If  we  treat  sampling  error  as  analogous  to  measurement 
error,  then  (adopting  the  language  of  classic  psychometric  reliability  theory)  the  estimates  of 
selectivity and market timing using the style index benchmark have a higher "reliability" than
the other estimates. This is consistent with our earlier observation that the style indices are
more  representative  of  the  managers'  true  investment  universes.  Hunter  and  Schmidt  (1990,  pp. 
115-116)  show  that  the  attenuating  effect  of  sampling  error  on  correlations  is  analogous  to  the 
attenuating  effect  of  measurement  error.  They  then  show  that  observed  correlations  can  thus  be 
corrected  for  sampling  error  in  the  same  way  as  the  psychometric  correction  for  measurement 
error,  or  unreliability,  in  psychometric  terminology. 

In the psychometric reliability model, the reliability of variable x is denoted $r_{xx}$ and is
defined as $\sigma_T^2/\sigma_x^2$, where T = true score and x = observed score. In the present context, the
variables to be correlated are actually estimates of the two parameters, selectivity and market
timing. If we estimate the "reliability" of each parameter as $s_\beta^2/s_b^2$, then we can substitute into
the psychometric two-sided correction for attenuation formula (Thorndike (1982)):

$$\text{corrected corr.} = \frac{\text{observed corr.}}{\sqrt{\text{reliability of } x}\,\sqrt{\text{reliability of } y}} \qquad (7)$$

The  observed  correlations  between  selectivity  and  timing  were  given  in  Table  V.  We  can 
now  correct  the  observed  correlations  in  Table  V  for  the  effect  of  sampling  error.  Thus,  for  the 
style index benchmark, we have corrected correlation = $-.359 / [\sqrt{.500}\,\sqrt{.318}] = -.90$ for
the Bhattacharya and Pfleiderer model, and $-.399 / [\sqrt{.500}\,\sqrt{.855}] = -.61$ for the Treynor and
Mazuy  model.  This  further  confirms  the  results  of  previous  studies  (see  Kon  (1983),  Henriksson 
(1984),  Coggin  and  Hunter  (1991),  and  Connor  and  Korajczyk  (1991)). 
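
The arithmetic in equation (7) can be checked with a few lines of Python. This is only a sketch: the function name is ours, and the inputs are the observed correlations and "reliabilities" quoted above for the style index benchmark.

```python
import math


def correct_for_attenuation(observed_corr, reliability_x, reliability_y):
    """Two-sided correction for attenuation, as in equation (7)."""
    return observed_corr / (math.sqrt(reliability_x) * math.sqrt(reliability_y))


# Style index benchmark: selectivity reliability .500 in both models,
# timing reliability .318 (Bhattacharya-Pfleiderer) and .855 (Treynor-Mazuy)
bp = correct_for_attenuation(-0.359, 0.500, 0.318)
tm = correct_for_attenuation(-0.399, 0.500, 0.855)
print(round(bp, 2), round(tm, 2))  # -0.9 -0.61
```

Each reliability is one minus the share of observed variance attributed to sampling error, so the lower the reliability, the larger the upward adjustment in the magnitude of the correlation.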

While  we  can  correct  the  observed  correlations  for  sampling  error,  we  cannot  in  any 
uncomplicated  way  correct  for  the  possibility  of  a  negative  correlation  between  the  two  described 
in  Jagannathan  and  Korajczyk  (1986).  They  show  that  it  is  possible  to  observe  a  negative 
correlation between selectivity and timing in a sample of mutual funds if the common stocks held
by the funds are more/less option-like than the stocks in the market proxy. However, since our
finding  of  a  negative  correlation  is  replicated  across  all  benchmark  portfolios  in  Table  V,  we 
believe  it  is  unlikely  that  our  observed  correlations  are  seriously  affected  by  this  problem. 

V.    Discussion 

A.        Sensitivity  of  Results  to  Benchmarks  and  Models 

Our  general  finding  is  that  selectivity  is  positive  and  timing  is  negative  on  average  across 
all  models  and  benchmarks.  The  results  in  Tables  III  and  IV  indicate  that  the  rankings  of  both 
performance  measures  are  not  very  sensitive  to  alternative  benchmarks  and  models  in  our  data. 
However,  we  did  observe  some  sensitivity  to  the  choice  of  a  benchmark  when  we  divided  the 
managers  up  by  investment  style.  These  results  contrast  with  those  of  Lehmann  and  Modest 
(1987)  and  Grinblatt  and  Titman  (1989a). 

It  should  be  pointed  out  that  there  is  a  problem  in  the  Lehmann  and  Modest  (1987) 
analysis.  They  examined  selectivity  in  the  context  of  a  Jensen-like  measure  using  the  CAPM 
and  APT  models.  Market  timing  and  factor  timing  activities  are  not  included  in  their  analysis. 
Market  timing  was  also  ignored  by  Grinblatt  and  Titman  (1989a).  Grant  (1977)  explained  how 
market  timing  actions  will  affect  the  results  of  empirical  tests  that  focus  only  on  selection  skill. 
He  showed  that  market  timing  ability  will  cause  the  observed  regression  estimate  of  selectivity 
to  be  downwardly  biased.  The  results  of  Lee  and  Rahman  (1990)  are  consistent  with  Grant's 
(1977)  contention.  A  similar  conclusion  was  drawn  by  Chang  and  Lewellen  (1984)  and 
Henriksson (1984). Moreover, as Jensen (1972), Admati and Ross (1985), Dybvig and Ross
(1985), and Grinblatt and Titman (1989b) have shown, the Jensen-like measure may penalize the
performance  of  market  timers. 

B.    Negative  Correlation  Between  Selectivity  and  Timing 

As  discussed  in  Sections  III  and  IV,  we  calculate  a  strongly  negative  correlation  between 
selectivity  and  market  timing  in  our  data.  Furthermore,  this  is  consistent  with  the  results  of 
several  other  studies.  The  literature  on  investment  management  contains  a  number  of  studies 
documenting  the  negative  market  timing  ability  of  mutual  fund  managers  (see  Chua  and 
Woodward (1986) for a summary and extension of these studies). Ours is the first study we
know of which documents this finding for pension fund managers. While we offer no formal
model  here  to  explain  the  negative  correlation,  we  can  offer  some  observations. 

The  job  of  equity  investment  management  can  be  said  to  include  two  separate  tasks:  picking 
stocks  and  timing  the  market.  As  many  studies  have  shown,  each  of  these  jobs  is  very  difficult 
to  do  well  consistently.  Indeed,  we  show  that  only  the  best  managers  do  well  on  either 
dimension  taken  separately.  This  has  resulted  in  many  managers  opting  to  market  to  prospective 
clients  only  one  of  these  skills.  There  is  also  much  anecdotal  evidence  indicating  that  a  growing 
number  of  pension  plan  sponsors  do  not  believe  that  market  timing  is  possible  on  a  consistent 
basis,  and  therefore  do  not  hire  managers  who  attempt  it.  The  strongly  negative  correlation 
between  selectivity  and  timing  in  our  data  suggests  that  those  managers  who  are  good  at 
selectivity  are  not  good  at  timing,  and  those  managers  who  are  good  at  timing  are  not  good  at 
selectivity.

This intuitively makes sense, because the two investment activities are largely separate and
distinct.  However,  recall  that  the  general  functional  form  of  our  estimating  equation  for 
selectivity  and  timing  is  the  nonlinear  Treynor-Mazuy  model.  An  inspection  of  the  standard 
econometric  formulas  quickly  reveals  that  the  sampling  errors  for  the  two  coefficients  in  this 
model  are  negatively  correlated.  This  clearly  contributes  to  the  negative  correlation  between  the 
two.  However,  we  note  that  Connor  and  Korajczyk  (1991)  also  found  a  negative  correlation 
between  selectivity  and  timing  using  a  "new  version  of  the  Henriksson-Merton  model,"  which 
does  not  appear  to  suffer  from  this  problem.  This  suggests  that  our  result  may  not  be  entirely 
artifactual.
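
One way to see the point about negatively correlated sampling errors is to compute, for a given series of market returns, the correlation implied by the usual OLS covariance matrix, which is proportional to the inverse of X'X, for the intercept and the squared-return coefficient. The sketch below assumes homoskedastic, uncorrelated residuals and a quadratic regression of the Treynor-Mazuy form; the function name and the return series are hypothetical. The sign is negative whenever the product of the sums of returns and cubed returns is smaller than the squared sum of squared returns, which is typical when returns vary in sign but need not hold in every sample.

```python
import math


def alpha_gamma_error_correlation(market_returns):
    """Correlation between the OLS sampling errors of the intercept (a) and
    the squared-return coefficient (g) in R_p = a + b*R_m + g*R_m**2 + e."""
    n = len(market_returns)
    s1 = sum(r for r in market_returns)
    s2 = sum(r ** 2 for r in market_returns)
    s3 = sum(r ** 3 for r in market_returns)
    s4 = sum(r ** 4 for r in market_returns)
    # Cofactors of X'X = [[n, s1, s2], [s1, s2, s3], [s2, s3, s4]].
    # The (positive) determinant cancels when forming the correlation.
    cov_ag = s1 * s3 - s2 ** 2
    var_a = s2 * s4 - s3 ** 2
    var_g = n * s2 - s1 ** 2
    return cov_ag / math.sqrt(var_a * var_g)


# Hypothetical per-period market returns, for illustration only
returns = [-0.04, -0.02, 0.0, 0.02, 0.04, 0.06]
corr = alpha_gamma_error_correlation(returns)
print(round(corr, 2))
```

For this made-up series the implied correlation is clearly negative, which is the mechanical effect referred to above.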

Finally,  one  needs  to  be  somewhat  concerned  about  the  size  of  the  timing  values.  At  a 
purely  statistical  level,  one  can  assess  the  significance  of  the  timing  values  by  looking  at  the 
t-tests.  However,  in  the  Treynor-Mazuy  model  the  impact  of  timing  on  portfolio  return  is,  in 
effect, measured by multiplying a rather small decimal fraction, $\gamma$, by a squared decimal
fraction, $(R_{mt})^2$. Thus, at the level of actual portfolio returns, there is a relatively small
reward/penalty to this activity in our data. Further research in the area of the measurement
and  assessment  of  market  timing  would  help  clarify  this  issue. 
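
To give a sense of the magnitudes involved, the following sketch multiplies a timing coefficient by a squared market return. Both numbers are hypothetical and chosen only for illustration; they are not estimates from our data.

```python
def timing_contribution(gamma, market_return):
    """Per-period return attributed to timing in a quadratic (Treynor-Mazuy) model."""
    return gamma * market_return ** 2


# Hypothetical values: a timing coefficient of -0.5 and a 5% market return
example = timing_contribution(gamma=-0.5, market_return=0.05)
print(f"{example:.5f}")  # -0.00125
```

Even with a coefficient of this size, the per-period effect is only about an eighth of one percent, because the squared return is itself very small.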

C.    Survivorship Bias

The  issue  of  survivorship  bias  is  well  known  in  studies  of  investment  performance.  A  recent 
study  by  Brown,  Goetzmann,  Ibbotson  and  Ross  (1991)  highlights  this  issue  with  regard  to 
performance  measurement.  The  basic  issue  here  is  as  follows.  Our  study  includes  71  pension 
managers with complete data from 1983 to 1990. Hence, any manager who may have
disappeared through merger or poor performance is not included in our data. To the extent that
our  sample  underrepresents  such  managers,  our  results  are  biased  in  favor  of  more  successful 
managers.  We  do  not  know  the  true  extent  of  this  bias  in  our  results,  but  the  results  in  Grinblatt 
and  Titman  (1989a)  suggest  that  it  is  not  large. 

VI.    Summary  and  Conclusion 

This  paper  presents  an  empirical  examination  of  the  selectivity  and  timing  performance  of 
a  sample  of  U.S.  equity  pension  fund  managers.  Our  major  findings  are  as  follows.  The 
results  on  selectivity  and  timing  are  only  mildly  sensitive  to  the  benchmark  portfolio  or 
estimation  model  used.  Moreover,  regardless  of  the  choice  of  benchmark  portfolio  or 
estimation  model,  the  selectivity  measure  is  positive  on  average;  and  the  timing  measure  is 
negative  on  average.  However,  selectivity  does  appear  to  be  somewhat  sensitive  to  the  choice 
of  a  benchmark  (and,  possibly,  the  time  period)  when  managers  are  classified  by  investment 
style.  In  almost  every  case,  meta-analysis  revealed  some  real  variation  (in  excess  of  that 
attributable  to  sampling  error)  around  the  mean  values  for  each  measure.  An  examination  of 
the  80%  probability  intervals  for  selectivity  revealed  that  the  best  equity  pension  fund  managers 
can  deliver  substantial  risk-adjusted  excess  returns.  Consistent  with  previous  studies  of  mutual 
fund  performance,  we  also  found  a  negative  correlation  between  selectivity  and  timing. 

Much  work  remains  to  be  done  in  this  area.  While  active  equity  managers  are  currently 
losing  ground  to  passively  managed  index  funds,  actively  managed  equities  still  represent  the 
largest  fraction  of  the  equity  component  of  corporate  pension  funds.  We  still  do  not  know  why 
some active managers are able to provide substantial risk-adjusted performance, while most
cannot. Identifying the characteristics of successful money managers should be one focus of
future  research.  While  there  are  some  interesting  hypotheses,  we  still  do  not  know  why  there 
is  a  consistently  negative  correlation  between  the  selectivity  and  timing  ability  of  active  equity 
managers. This is another fertile area for study.

Appendix I
This  appendix  is  based  on  Haughton  and  Christopherson  (1989). 
A.        Style  Descriptions 

1.  Earnings  Growth:  Earnings  Growth  managers  focus  predominantly  on  earnings  and 
revenue  growth  and  attempt  to  identify  companies  with  above-average  growth  prospects. 
In  general,  two  basic  categories  of  securities  are  owned  by  Earnings  Growth  managers  - 
(a)  companies  with  consistent  above-average  (historical  and  prospective)  profitability  and 
growth,  and  (b)  companies  expected  to  generate  above-average  near-term  earnings 
momentum  based  upon  company,  industry,  or  economic  factors. 

2.  Market-Oriented:  Market-Oriented  managers  are  broadly  diversified  managers  who 
participate  in  all  sectors  of  the  market.  The  portfolios  of  these  managers  may  either  be 
well diversified, or take meaningful sector/factor bets relative to the market toward both
growth  and  value  over  time.  Market-Oriented  managers  typically  are  willing  to  consider 
companies  representative  of  the  broad  market  when  seeking  investment  opportunities. 

3.  Price-Driven:  Price-Driven  managers  focus  on  the  price  and  value  characteristics  of  a 
security  in  the  selection  process.  These  managers  buy  stocks  from  the  low  price  portion 
of  the  market,  and  are  sometimes  called  value  or  defensive/yield  managers.  In  general, 
these  managers  focus  on  securities  with  low  valuations  relative  to  the  broad  market. 




4.  Small  Capitalization:  Small  Capitalization  managers  focus  on  small  capitalization  stocks. 
These  companies  may  be  unseasoned  and  rapidly  growing  but  sometimes  are  simply 
small  businesses  with  long  histories.  Typical  characteristics  of  small  capitalization 
portfolios  are  below-market  dividend  yields,  above-market  betas,  and  high  residual  risk 
relative  to  broad  market  indices. 

B.         Description of Russell Indices

Benchmarks  for  Aggregate  Portfolios 

Russell 3000 Index: The Russell 3000 Index includes the top 3000 U.S. companies
ranked by capitalization. Haughton and Christopherson (1989) discussed two reasons for
choosing the Russell 3000 Index over the S&P 500 Index.

(1)  The  S&P  500  spans  only  75%  of  the  investable  U.S.  equity  market.  As  such,  it 
has  a  large  capitalization  bias  but,  within  large  cap  stocks,  it  excludes  some  large 
companies. It also includes non-U.S. companies, so it is not strictly a U.S.
equity  market  benchmark.  There  is  no  adjustment  in  the  index  for  cross- 
ownership  of  shares,  resulting  in  the  overweighting  of  certain  companies.  Since 
it  covers  only  500  companies,  it  does  not  reflect  many  of  the  long-term  bets 
managers take away from the index.

(2)  The Russell 3000 covers 98% of the investable U.S. equity market. It weights
all  market  sectors  according  to  their  investment  opportunities,  and  is  confined  to 
U.S.  companies  and  hence  has  no  foreign  exposure.  It  is  adjusted  for  cross- 
ownership,  thereby  reflecting  true  investment  opportunities;  and  spans  nearly  all 
of  the  stocks  in  which  a  manager  is  likely  to  invest.  Hence,  the  index  is 
relatively  unbiased. 

Style  Indices 

Broad  market  benchmarks  like  the  S&P  500  and  the  Russell  3000  are  suitable  for 
evaluating  pension  managers  who  use  the  whole  market  as  a  base.  Many  U.S.  equity 
pension  managers  specialize  in  subsets  of  the  market.  As  such,  a  finer  set  of 
performance  benchmarks  that  more  closely  match  the  investment  styles  of  individual 
managers  is  needed  to  ensure  identification  of  elements  attributable  to  investment  styles. 
The  Frank  Russell  Company  maintains  four  style  indices  -  one  for  each  investment  style. 
The  key  fundamental  characteristics  of  each  style  index  are  similar  to  the  equity  profile 
of  a  typical  manager  of  that  style.  This  indicates  that  the  subuniverse  of  stocks  that 
comprise  the  style  indices  contains  the  type  of  stocks  from  which  each  style  of  managers 
would  normally  choose;  i.e.,  they  constitute  rough  "normal"  portfolios.  These  style 
benchmarks  are  much  more  representative  of  the  specialized  managers'  selection 
universes  than  the  broad  market  and  hence  should  provide  better  tools  for  performance 
evaluation. These style indices are:

1.  Russell 1000 Index: The Russell 1000 is the benchmark recommended for
Market-Oriented  style  managers.  It  is  composed  of  the  top  1000  stocks  in  the 
Russell  3000  Index  ranked  by  capitalization.  Hence,  it  focuses  on  the  broad- 
based  large  cap  segment  of  the  market  and  encompasses  about  90%  of  all  the 
equity  opportunities  in  the  U.S.  equity  market. 

2.  Russell  2000  Index:  The  Russell  2000  is  the  small  cap  benchmark  and  is  useful 
for  evaluating  small  capitalization  managers.  It  is  composed  of  the  smallest  2000 
stocks  in  the  Russell  3000  Index  ranked  by  capitalization.  Of  the  10%  of  the  total 
U.S.  equity  market  comprised  of  small  stocks,  the  Russell  2000  Index  covers 
about  8%. 

3.  Earnings  Growth  Index:  Earnings  Growth  Index  is  an  index  for  Earnings  Growth 
style  managers,  and  is  composed  of  those  securities  in  the  Russell  1000  Index  that 
have  above-average  growth  prospects.  Securities  in  this  style  index  are  weighted 
according  to  their  total  capitalization. 

4.  Price  Driven  Index:  Price  Driven  Index  is  an  index  for  Price  Driven  managers. 
It  is  a  capitalization-weighted  index  composed  of  those  securities  in  the  Russell 
1000  Index  that  have  low  valuations  relative  to  the  broad  market.  "Low 
valuation"  is  defined  by  examining  financial  ratios  such  as  the  P/E  ratio,  dividend 
yield, the price/book ratio, and the price/sales ratio.

Appendix II
The  Meta-Analysis  of  Regression  Values 
A.        Theoretical  Meta-Analysis  Parameters 

This  appendix  is  taken  from  a  more  detailed  presentation  given  in  Coggin  and  Hunter 
(1991).  Meta-analysis  was  developed  as  a  methodology  to  cumulate  results  across  studies.  In  this 
appendix,  we  will  use  the  words  "study,"  "manager,"  and  "portfolio"  interchangeably.  We 
initially  assume  that  the  number  of  managers  to  be  analyzed  is  large  enough  that  we  can  ignore 
sampling  error  due  to  a  finite  number  of  managers,  and  concentrate  on  sampling  error  in 
regression  estimates  for  individual  managers.  We  also  assume  that  the  specification  of  each 
regression equation is identical across managers. We denote observed regression values as $b$,
population values as $\beta$, and sampling error as $e$. Thus:

$$e = b - \beta \quad \text{or} \quad b = \beta + e \qquad \text{(A-1)}$$

The average observed value is:

$$\bar{b} = \bar{\beta} + \bar{e} \qquad \text{(A-2)}$$

Across a large number of managers, the average error, $\bar{e}$, will be zero; thus $\bar{b} = \bar{\beta}$.

Since we are comparing the portfolios of pension fund managers, we denote each manager
by the subscript $i$. Then:

$$b_i = \beta_i + e_i \qquad \text{(A-3)}$$

Across portfolios, $\beta$ and $e$ will be uncorrelated, so that the variance of observed values, $\sigma_b^2$, will
be larger than the variance of population values, $\sigma_\beta^2$, by the amount of sampling error, $\sigma_e^2$:

$$\sigma_b^2 = \sigma_\beta^2 + \sigma_e^2 \qquad \text{(A-4)}$$

From equation (A-4), the variance of the population regression values can be written as:

$$\sigma_\beta^2 = \sigma_b^2 - \sigma_e^2 \qquad \text{(A-5)}$$

The key to meta-analysis is the fact that the sampling error variance, $\sigma_e^2$, can be computed using
known statistical theory. Thus equation (A-5) becomes a formula to compute the population
variance, $\sigma_\beta^2$.

B.         Estimating  Meta-Analysis  Parameters 

In  the  previous  section,  we  assume  that  the  number  of  studies  to  be  cumulated  is  large. 
Specifically,  this  implies  that  the  observed  variance  of  the  sampling  errors  would  equal  the 
theoretical  sampling  error  variance.  If  the  number  of  studies  is  small,  then  the  observed  variance 
of  the  sampling  errors  will  differ  by  chance  from  the  theoretical  sampling  error  variance.  Hence 
we use the notation "$s^2$" for the estimated variances below.

If a population value is assumed to be constant across studies, Hunter and Schmidt (1990) show
that the best estimate of that value is its frequency-weighted average:

$$\bar{b} = \frac{\sum_i N_i b_i}{\sum_i N_i} \qquad \text{(A-6)}$$

where $b_i$ is the observed value in study $i$ and $N_i$ is the number of observations in study $i$. The
corresponding observed variance estimate across studies is the frequency-weighted average
squared deviation:

$$s_b^2 = \frac{\sum_i N_i (b_i - \bar{b})^2}{\sum_i N_i} \qquad \text{(A-7)}$$

The observed variance estimate, $s_b^2$, is a confounding of two sources of variation: variation
in population values (if any) and variation in observed values due to sampling error. Thus an
estimate of the variation in population values can only be obtained by correcting the observed
variance estimate, $s_b^2$, for sampling error. Hunter and Schmidt (1990) show that sampling error
across studies behaves like error of measurement, and the resulting formulas are comparable to
the  standard  formulas  in  classic  psychometric  measurement  theory  or  reliability  theory. 
From classic psychometric theory (Thorndike (1982)), we have:

Observed value = true value + error of measurement (A-8)

where the true value and error of measurement are uncorrelated. Hence:

Observed variance = true variance + error variance (A-9)

In meta-analysis, it is similarly true that the population regression values, $\beta_i$, and the
sampling error, $e_i$, are uncorrelated across studies. Therefore we can write:

Observed variance = population variance + sampling error variance (A-10)

$$s_b^2 = s_\beta^2 + s_e^2 \qquad \text{(A-11)}$$

The observed variance estimate, $s_b^2$, is the frequency-weighted average squared deviation defined
above. The sampling error variance estimate required by meta-analysis is then:

$$s_e^2 = \frac{\sum_i N_i (\text{standard error of } b_i)^2}{\sum_i N_i} \qquad \text{(A-12)}$$

The population variance (sometimes called the "corrected variance") can thus be estimated
as:

$$s_\beta^2 = s_b^2 - s_e^2 \qquad \text{(A-13)}$$

Equation (A-13) is the fundamental estimating equation for the theoretical values in equation
(A-5). 
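
The estimation steps in equations (A-6), (A-7), (A-12), and (A-13) can be sketched in Python as follows. This is a minimal illustration rather than the code used for our results; the function name, the returned keys, and the input values are all hypothetical.

```python
def meta_analysis(estimates, std_errors, n_obs):
    """Frequency-weighted meta-analysis of regression estimates.

    estimates  : observed regression values b_i, one per manager
    std_errors : standard errors of the b_i
    n_obs      : number of observations N_i behind each estimate
    """
    total_n = sum(n_obs)
    # (A-6): frequency-weighted mean of the observed values
    mean_b = sum(n * b for n, b in zip(n_obs, estimates)) / total_n
    # (A-7): frequency-weighted observed variance
    var_b = sum(n * (b - mean_b) ** 2 for n, b in zip(n_obs, estimates)) / total_n
    # (A-12): frequency-weighted sampling error variance
    var_e = sum(n * se ** 2 for n, se in zip(n_obs, std_errors)) / total_n
    # (A-13): corrected (population) variance; may be negative or zero
    var_beta = var_b - var_e
    return {
        "mean_b": mean_b,
        "var_b": var_b,
        "var_e": var_e,
        "var_beta": var_beta,
        "sampling_error_share": var_e / var_b,
    }


# Hypothetical estimates for four managers, for illustration only
result = meta_analysis(
    estimates=[0.2, -0.1, 0.4, 0.1],
    std_errors=[0.1, 0.1, 0.1, 0.1],
    n_obs=[30, 30, 30, 30],
)
print(result)
```

The last entry is the proportion of observed variance accounted for by sampling error, the quantity reported in the last row of each panel of Table VI.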

The population variance estimate, $s_\beta^2$, can be positive, negative or zero. If it is negative or
zero, the inference is that there is no variation in observed values that cannot be attributed to
sampling error. That is, all variance in observed values is artifactual. If the corrected variance
across studies is positive, it may still be trivial in size.

C.  A Significance Test for Real Variation Across Studies

The  hypothesis  that  there  is  no  real  variation  in  observed  values  has  a  statistical  test.  The 
ratio  of  the  observed  variance  estimate  to  the  sampling  error  variance  estimate  has  a  chi-square 
distribution  with  k-1  degrees  of  freedom: 

$$\chi^2 = \frac{k\,s_b^2}{s_e^2} \qquad \text{(A-14)}$$

where $k$ = number of studies.

This  statistic  can  be  used  as  a  formal  test  of  no  variation;  although  if  k  is  large,  it  has  high 
statistical  power  and  may  reject  the  null  hypothesis  given  even  a  trivial  amount  of  real  variation 
(Hedges  and  Olkin  (1985),  Cohen  (1988),  and  Hunter  and  Schmidt  (1990)).  Thus  if  the  chi- 
square  value  is  not  significant,  there  is  strong  evidence  that  there  is  no  real  variation  across 
studies.  However,  if  the  k  studies  are  not  independent,  then  the  power  of  the  chi-square  test  is 
reduced  as  discussed  in  the  next  section. 
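
A sketch of the test in equation (A-14) follows. To keep the example free of third-party libraries, the upper-tail probability uses the Wilson-Hilferty normal approximation to the chi-square distribution, which is our substitution and not necessarily the procedure used for the reported results; an exact chi-square routine would be preferable in practice. The function name is hypothetical, and the inputs combine the 71 managers with two of the sampling-error shares quoted in Section IV, taken at face value.

```python
from statistics import NormalDist


def homogeneity_test(var_b, var_e, k):
    """Chi-square test of no real variation across k studies, equation (A-14)."""
    chi_sq = k * var_b / var_e
    df = k - 1
    # Wilson-Hilferty approximation: a cube-root transform of chi_sq / df
    # is approximately normal with mean 1 - 2/(9 df) and variance 2/(9 df)
    z = ((chi_sq / df) ** (1 / 3) - (1 - 2 / (9 * df))) / (2 / (9 * df)) ** 0.5
    p_value = 1 - NormalDist().cdf(z)
    return chi_sq, df, p_value


# Only the ratio var_e / var_b matters, so var_b is set to 1.0
chi_hi, df_hi, p_hi = homogeneity_test(var_b=1.0, var_e=0.95, k=71)
chi_lo, df_lo, p_lo = homogeneity_test(var_b=1.0, var_e=0.71, k=71)
print(round(chi_hi, 1), round(p_hi, 3))
print(round(chi_lo, 1), round(p_lo, 3))
```

With a sampling-error share of 95% the statistic is about 74.7 on 70 degrees of freedom and is not significant at the .10 level, whereas a share of 71% gives a statistic of about 100 and is significant at the .05 level.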

D.  Independence 

Given  a  set  of  regression  estimates,  there  is  a  corresponding  set  of  sampling  errors.  In 
the  preceding  discussion,  it  was  assumed  that  the  variance  of  sampling  errors  across  the  studies 
would  itself  differ  only  by  sampling  error  from  the  hypothetical  error  variance  across 
independent  replications.  This  is  true  for  most  applications  of  meta-analysis  and  follows 
immediately  from  the  independence  of  the  estimates  across  studies.  However,  this  is  not  always 
true.

In this study the impact of the market proxy is controlled. However, the portfolios of two
equity  pension  fund  managers  may  overlap.  Hence  the  securities  the  two  portfolios  have  in 
common  will  contribute  their  particular  returns  to  both  portfolio  return  sequences.  The  residuals 
of  those  securities  will  thus  contribute  to  the  residuals  of  the  two  portfolios.  This  means  that 
the  two  portfolios  will  not  have  residual  time  series  that  are  entirely  independent.  Thus  the 
sampling  errors  for  the  two  portfolio  regressions  will  also  be  nonindependent  and  positively 
correlated. 

Consider the set of sampling errors for two portfolios. If the correlation between errors is
$r$, then the variance across portfolios will not be $\mathrm{Var}(e)$, but rather the product $(1-r)\,\mathrm{Var}(e)$.
The corresponding formulas for meta-analysis are:

$$\mathrm{Var}(b) = \mathrm{Var}(\beta) + (1-r)\,\mathrm{Var}(e) \qquad \text{(A-15)}$$

$$\mathrm{Var}(\beta) = \mathrm{Var}(b) - (1-r)\,\mathrm{Var}(e) \qquad \text{(A-16)}$$

$$\mathrm{Var}(\beta) = [\mathrm{Var}(b) - \mathrm{Var}(e)] + r\,\mathrm{Var}(e) \qquad \text{(A-17)}$$

Thus, traditional meta-analysis formulas will underestimate the variance of $\beta$. In particular, the
variances for timing and selectivity estimated in this paper are too low by some amount. The
adjusted formula for chi-square would thus be:

$$\chi^2 = \frac{k\,\mathrm{Var}(b)}{(1-r)\,\mathrm{Var}(e)} \qquad \text{(A-18)}$$

$$\chi^2 = \left[\frac{1}{1-r}\right]\left[\frac{k\,\mathrm{Var}(b)}{\mathrm{Var}(e)}\right] \qquad \text{(A-19)}$$

Hence,  the  traditional  test  statistic  for  homogeneity  of  regression  values  given  earlier  in  equation 
(A-14) would be an underestimate and thus would have somewhat lower than optimal power to
detect departures from homogeneity. Therefore the traditional chi-square test would be a
"conservative" test for heterogeneity.

The size of the correlation between residuals for two portfolios depends on the extent of
overlap  between  the  portfolios.  Most  equity  pension  managers  invest  in  many  securities  in  an 
effort  to  diversify  risk.  Moreover,  pension  fund  managers  are  independent  of  each  other  and 
typically  differ  significantly  in  management  style,  asset  allocation,  and  rebalancing  of  portfolios. 
Thus,  our  working  hypothesis  is  that  the  overlap  is  small  in  magnitude  and  hence  the  correlation 
r  is  small  enough  to  make  little  difference  in  our  analysis.  While  we  believe  this  hypothesis  to 
be  reasonable,  we  know  of  no  study  of  portfolio  overlap  which  we  could  consult  to  check  its 
validity.  Data  on  individual  securities  held  in  the  managers'  portfolios  were  not  available  to  us 
for this study.

Table III

Correlations  of  a  Performance  Measure  Between  Benchmarks* 

Each  model  was  estimated  for  all  managers  for  the  entire  period  using  each  of  the  three 
benchmark  portfolios.  Panel  A  presents  the  Pearson  and  Spearman  correlations  between 
selectivity  values  for  each  pair  of  benchmark  portfolios  for  each  model.  Panel  B  presents  the 
Pearson  and  Spearman  correlations  between  timing  values  for  each  pair  of  benchmark  portfolios 
for  each  model. 


Panel  A:    Selectivity 


Bhattacharya  &.  Pfleiderer  Model 


Style  Index  S&P  500 

Pearson    Spearman     Pearson     Spearman 


Russell  3000 
Style  Index 


.806 


.744 


.997 

.832 


.994 
.761 


Treynor  and  Mazuy  Model 


Russell  3000 
Style  Index 


.804 

.734 

.996 

.995 

.839 

.764 

'anel  B: 

Timing 

Bhattacharya  &  Pfleiderer  Model 


Style  Index  S&P  500 

Pearson   Spearman     Pearson     Spearman 


Russell  3000 
Style  Index 


.765 


.727 


.982 

.777 


.977 
.739 


Treynor  and  Mazuy  Model 


Russell  3000 
Style  Index 


.775 


.704 


.992 
.721 


.984 

.627 


"All  correlations  are  significant  at  the  .0001  level 
Table IV

Correlation  of  a  Performance  Measure  Between  Models* 

Each  model  was  estimated  for  all  managers  for  the  entire  period  using  each  of  the  three 
benchmark  portfolios.  This  table  presents  the  Pearson  and  Spearman  correlations  between 
selectivity  values  for  each  model  for  each  benchmark,  and  the  Pearson  and  Spearman 
correlations  between  timing  values  for  each  model  for  each  benchmark. 


Benchmark 
Russell  3000 
Style  Index 
S&P 500


Selectivity 
Pearson         Spearman 


.992 


.991 


.990 


.988 


.990 


.985 


Timing 
Pearson         Spearman 


.901 


.835 


.866 


.923 


.930 


.894 


*All correlations are significant at the .0001 level
Table V


Correlation  Between  Selectivity  and  Timing 


Each model was estimated for all managers for the entire period using each of the three
benchmark  portfolios.  This  table  presents  the  Pearson  and  Spearman  correlations  between  the 
selectivity  and  timing  values  for  each  model  for  each  benchmark. 


Benchmark 

Russell  3000 
Style  Index 
S&P 500


Bhattacharya  and  Pfleiderer  Model        Treynor  and  Mazuy  Model 
Pearson        Spearman  Pearson       Spearman 

-.447  -.488  -.485  -.427' 


-.359" 


.487 


-.315^ 


-.504 


-.399" 


-.467 


.359^ 


-.387 


•  significant  at  the  .0002  level 
''  significant  at  the  .0006  level 
'  significant  at  the  .0008  level 
''  significant  at  the  .0021  level 
'  significant  at  the  .0075  level 

All  other  correlations  are  significant  at  the  .0001  level 




Table  VI 

Meta-Analysis  Results 

This  table  presents  the  meta-analysis  results  for  the  selectivity  and  timing  values  based  on  the  three  benchmark 
portfolios  and  using  heteroscedasticity-corrected  t-values,  for  the  entire  period  (N  =  71  managers). 

Panel A: Bhattacharya and Pfleiderer Model

                              Selectivity                                  Timing
                  S&P 500    Russell 3000   Style Index     S&P 500    Russell 3000   Style Index

b                 .000339      .000769       .001624       -.046979     -.009194      -.010003
S_b               .002646      .002646       .002450        .105010      .114070       .123984
S_β               .001467      .001773       .001740        .022942      .050224       .069973
S_b²              .000007      .000007       .000006        .011027      .013012       .015372
S_β²              .000002      .000003       .000003        .000526      .002523       .004896
S_e²              .000005      .000004       .000003        .010501      .010489       .010476
χ² (df = 70)      100.61**     124.82**      135.43**       74.56        88.07*        104.18**
S_e²/S_b²         .7143        .5714         .5000          .9523        .8061         .6815

Panel B: Treynor and Mazuy Model

                              Selectivity                                  Timing
                  S&P 500    Russell 3000   Style Index     S&P 500    Russell 3000   Style Index

b                 .000422      .000796       .001645       -.279925     -.082756      -.070593
S_b               .002646      .002646       .002450        .635032      .598273       .593544
S_β               .001625      .001793       .001871        .573466      .545419       .548967
S_b²              .000007      .000007       .000006        .403266      .357931       .352295
S_β²              .000002      .000003       .000003        .328863      .297482       .301365
S_e²              .000005      .000004       .000003        .074403      .060449       .050930
χ² (df = 70)      111.83**     128.70**      153.41**       384.82**     420.41**      491.12**
S_e²/S_b²         .7143        .5714         .5000          .1845        .1689         .1446

**  Significant  at  the  .05  level  or  less. 
*  Significant  at  the  .10  level. 
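
The variance rows of Table VI appear to be linked by simple identities: the population variance equals the observed variance minus the sampling error variance, and the last row is the share of observed variance attributable to sampling error. This reading is inferred from the reported numbers rather than quoted from the appendix. The Python sketch below checks it against the Treynor and Mazuy timing values for the S&P 500 benchmark. The function name `decompose_variance` and the dictionary keys are hypothetical.

```python
import math


def decompose_variance(var_observed, var_error):
    """Split observed variance into population and sampling-error parts.

    Assumes var_population = var_observed - var_error, floored at zero.
    """
    var_population = max(var_observed - var_error, 0.0)
    return {
        "var_population": var_population,
        "sd_population": math.sqrt(var_population),
        "error_share": var_error / var_observed,
    }


# Treynor and Mazuy timing, S&P 500 benchmark (values from Table VI)
result = decompose_variance(var_observed=0.403266, var_error=0.074403)

for name, value in result.items():
    print(f"{name}: {value:.6f}")
```

For these inputs the computed population variance, population standard deviation, and error share agree with the tabulated .328863, .573466, and .1845 up to rounding.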
Table VII

80% Probability Intervals for Observed and Population Selectivity and Market Timing Values

This  table  presents  the  80%  probability  intervals  for  the  observed  and  population  values  of  selectivity  and 
market timing using all managers for the entire period. The observed values are bounded by b ± 1.28(S_b),
and the population values are bounded by b ± 1.28(S_β).


Panel A: Bhattacharya and Pfleiderer Model

                             Observed Values                               Population Values
                   Selectivity           Market Timing            Selectivity           Market Timing
Benchmark        Lower      Upper       Lower       Upper       Lower      Upper       Lower       Upper

S&P 500        -.003121    .003800    -.181391     .087434    -.001538    .002217    -.076345    -.017613
Russell 3000   -.002687    .004255    -.155203     .136814    -.001500    .003038    -.073481     .055092
Style Index    -.001606    .004854    -.168703     .148697    -.000604    .003852    -.099569     .079563

Panel B: Treynor and Mazuy Model

                             Observed Values                               Population Values
                   Selectivity           Market Timing            Selectivity           Market Timing
Benchmark        Lower      Upper       Lower       Upper       Lower      Upper       Lower       Upper

S&P 500        -.003019    .003864   -1.092766     .532916    -.001657    .002502   -1.013962     .454111
Russell 3000   -.002632    .004223    -.848546     .683034    -.001499    .003091    -.780893     .615381
Style Index    -.001622    .004912    -.830330     .689144    -.000750    .004040    -.773271     .632085
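
As a worked illustration of the interval rule in the caption of Table VII, the Python sketch below computes b ± 1.28 S for the Treynor and Mazuy timing measure with the S&P 500 benchmark, using the mean and standard deviations reported in Table VI. The function name `probability_interval` is hypothetical. The multiplier 1.28 is the rounded 90th percentile of the standard normal distribution, which leaves 10% in each tail, and the sketch also computes the unrounded value for comparison.

```python
from statistics import NormalDist


def probability_interval(mean, sd, z=1.28):
    """Return (lower, upper) bounds of mean +/- z * sd."""
    half_width = z * sd
    return mean - half_width, mean + half_width


# Unrounded multiplier for a central 80% interval under normality
z_exact = NormalDist().inv_cdf(0.90)

# Treynor and Mazuy timing, S&P 500 benchmark (values from Table VI)
mean_b = -0.279925
sd_observed = 0.635032    # S_b
sd_population = 0.573466  # S_beta

observed = probability_interval(mean_b, sd_observed)
population = probability_interval(mean_b, sd_population)

print(f"z (exact): {z_exact:.4f}")
print(f"observed:   {observed[0]:.6f} to {observed[1]:.6f}")
print(f"population: {population[0]:.6f} to {population[1]:.6f}")
```

The population interval is the narrower of the two because S_β excludes the sampling error variance that inflates S_b.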